Thursday, 06/01/2023
10:17
Client-side and server-side rendering are both necessary to make the best websites.
The most important role of a website is to communicate and present data in the best way possible. The best tool is often a static document; this allows you to communicate information that doesn't frequently change to a user.
The ability to take a snapshot - to download a single HTML file and have access to all of the information you'd like to see - allows users to save web documents for themselves and access them whenever they'd like. It's really important for websites to provide static data with the lowest lift possible. This allows snapshot tools to perfectly capture their state, giving the users of a website the ability to communicate data online or offline.
What if that information frequently changes, though?
I've written before about the 'three stages' of information on the web. Information can be retrieved at three times: at site deployment time (the developer deploying the site to a server), at user access time (when the user requests to see the information from the server by clicking a link), and at runtime (updating data while the user is viewing a page).
We know the user wants the most up-to-date information, but each stage comes with a performance penalty; delivering information at access time and runtime can introduce significant lag if not approached properly, as the live data has to be retrieved.
We can load data in 'after the fact' by having the browser request live data again after a page loads. This is a super common React strategy, and improves load times for the user - but means that the page served to the user initially is often kind of useless (it has none of the relevant data until a user spends some time on it!), preventing any sort of archival tool from properly preserving the site at a point in time.
This also may be irresponsible - do I want to render the data once on my one computer, or again and again on the computers of every single one of my visitors? The second is clearly much more expensive in aggregate. We need to give users the most relevant live data though!
When considering how a platform is built, strive to store all information at site deployment time. If information might change between user access times, that data will have to be dynamically retrieved. If information might change when a user accesses the page, the data will have to be dynamically rendered by a client.
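A rough sketch of that rule of thumb, in TypeScript. The names here (`Stage`, `DataProfile`, `earliestSafeStage`) are made up for illustration; the only thing taken from the note is the three-stage split itself.

```typescript
// The three stages at which information can be retrieved.
type Stage = "deploy" | "access" | "runtime";

// Hypothetical description of how volatile a piece of data is.
interface DataProfile {
  changesBetweenVisits: boolean; // may differ from one page request to the next
  changesDuringVisit: boolean;   // may change while the page is open
}

// Pick the earliest (cheapest) stage that still shows up-to-date data.
function earliestSafeStage(profile: DataProfile): Stage {
  if (profile.changesDuringVisit) return "runtime";
  if (profile.changesBetweenVisits) return "access";
  return "deploy";
}
```

An about page would land on "deploy", a product listing on "access", and a live price ticker on "runtime".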
This also calls for three different ways of rendering a website. The first stage is supported by a compiler from source files to target files. The second stage is supported by a service that pulls in live data, sticks that data into a compiler pipeline, and sends the output over the wire. The third uses JavaScript to continuously request and render data from the user's computer.
Because rendering information at different times has these tradeoffs, switching between different rendering strategies for particular portions of the website should be as easy as possible. If I want my data to render statically but update live, I will have to render that data in two places - on the server and on the computer of the user. I will also need to obtain that data in both of those places - ideally from the same source.
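One way this could look, as a minimal sketch: a single pure render function that returns an HTML string, called once on the server and again on the client. Everything named here (`Quote`, `renderQuote`, `renderPage`, `applyUpdate`) is hypothetical, and the `mount` parameter stands in for a DOM element so the sketch does not depend on a browser.

```typescript
// Hypothetical data shape; the same source would feed server and client.
interface Quote {
  symbol: string;
  price: number;
}

// Escape text before placing it inside HTML.
function escapeHtml(s: string): string {
  return s
    .replace(/&/g, "&amp;")
    .replace(/</g, "&lt;")
    .replace(/>/g, "&gt;")
    .replace(/"/g, "&quot;");
}

// The single rendering function shared by both environments.
function renderQuote(q: Quote): string {
  return `<span class="quote">${escapeHtml(q.symbol)}: ${q.price.toFixed(2)}</span>`;
}

// Server side: bake the current data into the static document.
function renderPage(q: Quote): string {
  return `<!doctype html><html><body><div id="quote">${renderQuote(q)}</div></body></html>`;
}

// Client side: re-render into the same mount point when fresh data arrives.
// In a browser, `mount` would be the element with id "quote".
function applyUpdate(mount: { innerHTML: string }, q: Quote): void {
  mount.innerHTML = renderQuote(q);
}
```

The page that goes over the wire already contains the data, so a snapshot tool captures something useful, and the live update reuses the exact same markup.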
How do we solve this?
- The compiler can render information with any language.
- The server can render information with any language.
- The web renders information with JavaScript.
Cool. The UI development language has to either be JavaScript or support JavaScript.
What about alternative rendering strategies? What if I want my app to render easily on desktop and web?
If we want to draw with pixels, we can 'sideload' rendering on the web through the HTML canvas. This would allow users to program in their language of choice. This also sacrifices all of the tools that the web browser provides and prevents static rendering entirely (it is not possible to draw a canvas statically).
If we want to draw with the GPU instead of just putting pixels on the screen, a presumably faster strategy, we can program against the WebGPU API on both the web and the desktop - but again we lose all of those advantages of HTML on the web.
Cool, maybe we can bring the web to us. Let's wrap our app in a web browser and have users download our application code, then tell the browser to render that.
Some problems:
- Web browsers are huge - several hundred megabytes at the least. It is irresponsible to ship an application that's probably a few thousand lines of code (< 1 MB) as a 300 MB app.
- Web browsers update frequently. Many of these updates introduce new APIs or fix security vulnerabilities. The former is fine - we can avoid those APIs - but unknowingly leaving outdated, vulnerable code on the computers of our users that we cannot easily fix is rough.
- The web API may not be the best paradigm for rendering. If I need expressive and performant 3D tools, GPU through JS might be too slow. I want the machine's native hardware and optimized, low-level code.
Because our documents are glued to the browser - the most expressive document viewer ever - everyone expects their applications to be accessible there, too. Links are really powerful. Requiring a user to download software to try it out simply is not a good option nowadays.
This is why comprehensive rendering solutions are so difficult. There are two APIs we have to glue into if we want both fast and browser-optimal code to be available everywhere, both with very large surface areas.
We have two paths to move forward:
- Reinvent the wheel. Deny that people use browsers and require users to download new software that reinvents the idea of the browser as a platform. We can write optimal code with good APIs that runs everywhere. For portability, we can pre-compile software for every platform that uses it and statically link it, or we can distribute a virtual machine that our software runs on. This requires significant user buy-in, but it means that we can ship native-feeling apps with small application sizes that are available from a link. We also lose all of the external work in the browser on extensions or other development tools. A compatibility layer to HTML canvas can be implemented here, but that loses all of the accessibility features and lots of the performance benefits.
- Develop against a very large API surface. We have red functions and blue functions that can't be mixed - HTML/CSS compatible functions and WebGPU/Canvas compatible functions. Somehow both types have to be executable both in the browser and on the server. We have to preserve the information about the context in which these can be used - GPU has to happen inside of a canvas, which has to be inside of HTML - which means we're glued to a strongly typed language if we want to produce code that has a decent performance profile.
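To make the second path a bit more concrete, here is a sketch of how a type system could track the "red" and "blue" contexts. This is only an illustration in TypeScript with invented names (`Ctx`, `UINode`, `text`, `rect`, `gpuMesh`, `gpuSurface`, `canvas`, `page`); nodes carry a string label instead of doing any real rendering.

```typescript
// The context a node is allowed to live in.
type Ctx = "html" | "canvas" | "gpu";

// A UI node tagged with its context. The tag is an ordinary field,
// so the compiler treats nodes from different contexts as different types.
interface UINode<C extends Ctx> {
  readonly ctx: C;
  readonly label: string;
}

const labels = (nodes: ReadonlyArray<{ label: string }>): string =>
  nodes.map((n) => n.label).join(",");

// HTML-context primitive.
const text = (s: string): UINode<"html"> => ({ ctx: "html", label: `text(${s})` });

// Canvas-context primitive.
const rect = (w: number, h: number): UINode<"canvas"> => ({ ctx: "canvas", label: `rect(${w}x${h})` });

// GPU-context primitive.
const gpuMesh = (name: string): UINode<"gpu"> => ({ ctx: "gpu", label: `mesh(${name})` });

// GPU work has to happen inside of a canvas...
const gpuSurface = (children: UINode<"gpu">[]): UINode<"canvas"> => ({
  ctx: "canvas",
  label: `gpu[${labels(children)}]`,
});

// ...and a canvas has to live inside of HTML.
const canvas = (children: UINode<"canvas">[]): UINode<"html"> => ({
  ctx: "html",
  label: `canvas[${labels(children)}]`,
});

const page = (children: UINode<"html">[]): UINode<"html"> => ({
  ctx: "html",
  label: `page[${labels(children)}]`,
});

// Well-typed nesting: HTML > canvas > GPU.
const example = page([text("hi"), canvas([rect(1, 2), gpuSurface([gpuMesh("cube")])])]);

// Mixing contexts, e.g. page([rect(1, 2)]), is rejected by the type checker
// because UINode<"canvas"> is not assignable to UINode<"html">.
```

The nesting rule is enforced at compile time, which is the property the bullet above is asking for.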
Things that are not up for discussion:
- Reactivity. The consensus is that reactive frameworks are obviously good for displaying complex information. Parts of every application should have the ability to be written with a high-level, reactive API because this is such an expressive paradigm and development velocity win. Imperative GUI modes are best for real-time rendering - and can be more performant in some cases - but the caching control that reactive frameworks provide can also save lots of compute that we don't want to spend if we don't want to re-render something complex.
- Expressive rendering with GPU. We want the most performant software possible. The web is disrespectful.
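On the caching point under reactivity: the idea is just "don't re-render if the input hasn't changed". A tiny sketch, not tied to any real framework (`memoRender` is a made-up helper, and it only remembers the most recent input):

```typescript
// Wrap a render function so that repeated calls with the same input
// (compared with Object.is) reuse the previous output.
function memoRender<T>(render: (input: T) => string): (input: T) => string {
  let last: { input: T; output: string } | undefined;
  return (input: T): string => {
    if (last !== undefined && Object.is(last.input, input)) {
      return last.output; // cache hit: skip the expensive render
    }
    const output = render(input);
    last = { input, output };
    return output;
  };
}
```

Real reactive frameworks track dependencies at a much finer grain than this, but the compute saving comes from the same place.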
All I'm saying is that reinventing the wheel is looking really good right now...
11:14
Why can't we just target HTML/CSS with business logic in JS / WebAssembly?
We always want accessibility hints, and we always want debuggability - a document flow is ideal for those. A lot of the time, though, the web presents problems to us. The DOM cannot render pixel-perfect documents without the canvas.
Google Docs moved to render entirely with canvas recently. The announcement cited performance and cross-platform consistency without going into much detail, so I have some suspicions about the specifics:
- Implementing an expressive, interactive layout engine with the DOM is really rough. You want to be able to shift margins and boxes by specific pixel sizes and make adjustments at different scales. The DOM becomes a bottleneck.
- Font rendering on the web is a moving target. You aren't in control of the font rendering strategy that your client's browser uses, so you can't control what font rendering primitives they have access to, if they can support variable fonts or certain points or certain glyphs. Downloading fonts only solves half the battle, and injecting a custom renderer for text is reinventing the wheel but in a more complex way.
- Fonts and layout engines interact in really complex ways; it's been hard to get this right at work, even for our web application that isn't doing anything unique at all with fonts or font rendering. Rendering all of the fonts and the layout to canvas allows the implementer to be in complete control of rendering logic - not the browser.
- Interacting with the DOM prevents you from being completely in control of your data sync story. Google Docs wants to always render real-time synced text and formatting data. To change how the DOM looks, code has to iterate through all of the DOM nodes, making small changes and adjustments. The two ways of doing this - modifying the existing DOM to incorporate the new changes in real time and completely re-flowing the doc - introduce significant performance bottlenecks. Low latency for text documents is incredibly important. Seamless application sync over the internet is really important to their real-time, collaborative platform. They cannot afford to take the performance hit that DOM re-flowing incurs.
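For a feel of the first strategy in that last point (patching the existing document instead of re-flowing everything), here is a deliberately naive sketch. It diffs two versions of a document line by line, by position only; `LinePatch`, `diffLines`, and `applyPatches` are invented names, and this is in no way how Google Docs actually does it.

```typescript
// A patch either sets the text of a line or removes it (text === null).
interface LinePatch {
  index: number;
  text: string | null;
}

// Compare two versions position by position and emit only the differences.
function diffLines(prev: string[], next: string[]): LinePatch[] {
  const patches: LinePatch[] = [];
  const length = Math.max(prev.length, next.length);
  for (let i = 0; i < length; i++) {
    const line = next[i];
    if (line === undefined) {
      patches.push({ index: i, text: null }); // line no longer exists
    } else if (prev[i] !== line) {
      patches.push({ index: i, text: line }); // line changed or was added
    }
  }
  return patches;
}

// Apply patches to a copy of the old version. With a positional diff,
// removals only ever occur at the tail, so they become a truncation.
function applyPatches(lines: string[], patches: LinePatch[]): string[] {
  const out = [...lines];
  let truncateAt: number | undefined;
  for (const p of patches) {
    if (p.text === null) {
      truncateAt = truncateAt === undefined ? p.index : Math.min(truncateAt, p.index);
    } else {
      out[p.index] = p.text;
    }
  }
  return truncateAt === undefined ? out : out.slice(0, truncateAt);
}
```

Even in this toy form, every patch still has to be found by walking the whole document, and each one would touch a DOM node and may trigger layout.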
The Docs team also added a feature to support static web rendering via the DOM. This allows those live, view-only previews and snapshots to be taken, efficiently rendering a static site that is served to others without the issues discussed. Unfortunately, they have to write all of the same code twice - once for the static doc that's distributed to others and again for the canvas editing version.
Thankfully, the canvas doesn't sacrifice all of the browser tools - its API does offer some accessibility tags and primitives: https://pauljadam.com/demos/canvas.html.
This means that if we want to give application developers expressive and fast tools, we cannot rely on DOM rendering to support every use case. There must be a seamless way for them to fall back to pixels. The canvas API, I'd argue, is not seamless - those accessibility hints and tools cannot be rendered statically in documents, for one (unless you count images and SVGs - but then you sacrifice the interactivity that makes HTML docs so brilliant to an opaque image).
The conclusion here is basically that we need to be able to develop custom, pixel-perfect tools within the canvas that the browser will render statically to a document, but that can be interactive when that document is open. I haven't fully explored why static HTML - rather than JS-augmented HTML - is important, but it's mostly because JS is a mess and is far more expressive than what we need most of the time. Documents should be usable without executing a general-purpose programming language - users should never have to incur that performance hit.
I'll have to rewrite this whole article when formalizing it.
- public document at doc.anagora.org/2023-06-01
- video call at meet.jit.si/2023-06-01